BASIC PATTERNS OF CHINESE CODES AND CIPHERS
William T. Mau, B4 


When is a code not a code? When it's Chinese plain text. As many, but not
all, readers know, in ordinary communications the Chinese use a four-digit
code to represent the thousands of characters of their written language.
That code is the Standard Telegraphic Code (STC), a page of which is shown
at left. Since Chinese characters cannot be sent by telegraph, this set of
digital equivalents was set up to make telecommunications possible. (The
four-digit groups are what we usually see in traffic; the trigraphs below
the characters are an alternate set of equivalents, rarely used.)


Some version of this code has been in use in China ever since telegraphy
was introduced there. The Chinese Nationalists use the older "CTC" (Chinese
Telegraphic Code, also known as the "Ming Code" from the Chinese words on
the cover of the book: MING MA, plain code), which contains older forms of
many characters, and which reads from right to left rather than left to
right.

The STC book consists of 100 pages, on each of which is a 10 x 10 matrix
with single-digit row and column coordinates. In each cell is one Chinese
character. The characters are arranged in "radical/stroke" order (the most
common dictionary order in Chinese use), with, unfortunately, some
out-of-order exceptions. The basic characters occupy the first 79 pages;
the remaining pages contain short forms, additional characters, and special
tables, including dates, times, marks of punctuation, and the Latin
alphabet.
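
The page-and-coordinate layout of the STC book can be sketched in a few lines. The split used here (first two digits as the page, third digit as the row, fourth digit as the column) is an assumption made for illustration; the text states only that each page is a 10 x 10 matrix with single-digit coordinates.

```python
def split_stc_group(group: str) -> tuple[int, int, int]:
    """Split a four-digit STC group into (page, row, column).

    Assumed layout: digits 1-2 = page (00-99), digit 3 = row,
    digit 4 = column. The row/column assignment is an assumption.
    """
    if len(group) != 4 or not group.isdigit():
        raise ValueError(f"not a four-digit group: {group!r}")
    return int(group[:2]), int(group[2]), int(group[3])


def join_stc_group(page: int, row: int, col: int) -> str:
    """Rebuild the four-digit group from its page and coordinates."""
    return f"{page:02d}{row}{col}"


print(split_stc_group("3029"))   # (30, 2, 9)
print(join_stc_group(30, 2, 9))  # 3029
```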


[Figure: sample page of the STC book. Each cell shows a character with its
four-digit group and, below it, the alternate trigraph.]





The root or base of a Chinese character is the "radical." There are 214 radicals,
most of which are characters by themselves; i.e., they have a meaning. But their
meaning is extended or modified by the addition of one or more strokes to form
another character. The number of radicals was established about 1660, and their
dictionary order is fixed by the number of strokes it takes to write each one--from
a single stroke (一) to 17 (龠). In STC each radical appears as a "section
heading," and the characters which follow it are arranged basically in ascending
number of strokes added to the basic root. (Compare this with the English alphabet
with its 26 letters, only two of which have meaning when they stand alone.)

This article examines certain features of Chinese-language cryptographic code
systems. Each type is treated separately, but in actual practice a code system
often involves two or more of the methods of encryption discussed below. In
addition, two or more methods for encoding values not in the vocabulary can be
used in a single code system.


BASIC CODE PATTERNS 


As we all learned in early training, codes 
can be one-part or two-part. One-part codes 
are so formatted that one book suffices for both 
encoding and decoding. In two-part codes the 
order of plain and code equivalents is so mixed 
that two books are required, one for encoding 
and one for decoding. A modified one-part code 
is one in which the regular pattern has been 
complicated in some way. Most Chinese codes are 
one-part. 


But one of the most important questions 
facing the codebreaker as he looks at a new 
Chinese code is this: 


Assuming that it is one-part, and that the 
values therefore occur in logical order, rather 
than scrambled, what is that logical order? In 
European languages it would be some form of 
alphabetic order, but Chinese has at least five 
possible ways of arranging a one-part code: 


Radical/Stroke Order 


The codes used by the People's Republic of China (PRC) may be arranged in
the usual radical/stroke order which parallels the STC book, but other
systematic orders--even other radical/stroke systems--are possible.
Compounds can be inserted between single characters. For example, a given
row of ten values in the STC book is 0100, 0101, 0102, 0103, etc., through
0109. The corresponding values in a radical/stroke order code might be
0100, 0100 0226, 0101, 0102, 0102 7022, etc.
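
One way to picture such an arrangement: if each entry is treated as a sequence of STC groups, plain lexicographic sorting places every compound directly after its first character. The sketch below uses the values from the example; representing entries as tuples is an assumption made for illustration, not a description of any actual codebook.

```python
# Each codebook entry is a tuple of STC groups: one group for a single
# character, two or more for a compound.
entries = [
    ("0102", "7022"),
    ("0101",),
    ("0100", "0226"),
    ("0102",),
    ("0100",),
]

# Tuples compare element by element, and a shorter tuple sorts before a
# longer one that shares its prefix, so compounds follow their head character.
ordered = sorted(entries)

for entry in ordered:
    print(" ".join(entry))
```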


Phonetic Order 


An alphabetic order for Chinese-language values can be achieved by one of
several systems of phonetic representation of characters. Strangely, the
system usually seen is not the Pinyin, introduced in 1958 (with which we at
NSA prefer to work for convenience' sake), but the older, National Phonetic,
system, designed in the early 1920's in imitation of the Japanese kana. STC
books published in China contain these 37 phonetic symbols in the otherwise
blank cells 9720--9756. Thus while the syllables in the names of Mao Ze
Dong, Shanghai and Beijing (Peking/Peiching) would occur in the same order
in a Pinyin listing as in English alphabetical order: Bei, Dong, Hai, Jing,
Mao, Shang, Ze, under the National Phonetic system they would be listed in
the order: Bei, Mao, Dong, Hai, Jing, Shang, Ze. (This system is known, from
the first four syllables in it, as the Bo-Po-Mo-Fo system.)
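
The difference between the two phonetic orders can be sketched by sorting the Pinyin spellings twice: once alphabetically, and once by the position of each syllable's initial in the Bo-Po-Mo-Fo sequence. Only initials are compared here, which is enough for the seven syllables of the names example; a fuller treatment would also have to rank the finals.

```python
# Initials in National Phonetic (Bo-Po-Mo-Fo) order, written in Pinyin.
BOPOMOFO_INITIALS = [
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s",
]

def initial_rank(syllable: str) -> int:
    """Return the Bo-Po-Mo-Fo rank of a Pinyin syllable's initial."""
    s = syllable.lower()
    # Check two-letter initials (zh, ch, sh) before single letters.
    for length in (2, 1):
        if s[:length] in BOPOMOFO_INITIALS:
            return BOPOMOFO_INITIALS.index(s[:length])
    # Simplification: syllables with no initial are placed last.
    return len(BOPOMOFO_INITIALS)

names = ["Mao", "Ze", "Dong", "Shang", "Hai", "Bei", "Jing"]

pinyin_order = sorted(names)
phonetic_order = sorted(names, key=initial_rank)

print(pinyin_order)    # ['Bei', 'Dong', 'Hai', 'Jing', 'Mao', 'Shang', 'Ze']
print(phonetic_order)  # ['Bei', 'Mao', 'Dong', 'Hai', 'Jing', 'Shang', 'Ze']
```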


Total Strokes Order


The same names might be ordered by the number of strokes with which they
are written, using the number of strokes in the new (circa 1958) short forms
where applicable. Thus arranged, they would read: Shang (上, 3 strokes),
Mao (毛, 4 strokes), Bei (北, 5 strokes), Dong (东, 5 strokes), Jing (京,
8 strokes), Ze (泽, 8 strokes), Hai (海, 10 strokes).

Characters with the same number of strokes may be arranged within that
category by either phonetic or radical order.
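
A total-strokes ordering with a tie-break can be sketched as a two-part sort key. The stroke counts are those given above and the STC groups are those listed in the comparative table; using the STC group as the tie-break is one of the two options mentioned (radical order), chosen here only for illustration.

```python
# (syllable, STC group, stroke count); stroke counts as given in the text.
characters = [
    ("Mao", "3029", 4),
    ("Ze", "3419", 8),
    ("Dong", "2639", 5),
    ("Shang", "0006", 3),
    ("Hai", "3189", 10),
    ("Bei", "0554", 5),
    ("Jing", "0079", 8),
]

# Sort by stroke count first; break ties by STC group, which follows
# radical/stroke order.
by_strokes = sorted(characters, key=lambda c: (c[2], c[1]))

print([name for name, _, _ in by_strokes])
# ['Shang', 'Mao', 'Bei', 'Dong', 'Jing', 'Ze', 'Hai']
```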


Sentential/Category Order 


Many PRC codes, especially military codes, are in the form of charts rather
than books. Dimensions vary, but the 9 x 9 matrix is most common. The usual
number of matrices in a code of this type is from six to nine.

Such charts frequently use popular phrases and sentences to fill in the
rows of the matrices. Categories such as time (year, month, day, hour) and
points of the compass (east, south, west, north) are listed in adjoining
cells in other matrices. (Of the characters used in our names example,
above, "Dong," the literal meaning of which is east, and "Bei," literally
north, would be found with south and west.) The sentential/category code is
often referred to as "modified one-part" because it is "one-part" (requires
only one book) to someone who has memorized the sentences and has the chart
in front of him.
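
A chart of this kind can be pictured as a set of numbered lines, each holding a memorized sentence split into cells. The sketch below is purely illustrative: the English cell contents stand in for Chinese characters, and the group format (a two-digit line number followed by a one-digit position) is an assumption. The cells were chosen so that the result matches the groups 114 132 105 116 printed in the article's sentential example.

```python
# Hypothetical chart: line number -> cells of a memorized sentence.
# English words stand in for the Chinese characters of a real chart.
CHART = {
    10: ["all units", "pay attention", "conceal", "equipment",
         "strengthen", "air defense"],
    11: ["our", "advance", "blocked", "request", "artillery", "support"],
    13: ["direct", "your unit", "complete", "combat preparations"],
}

# Reverse index: cell contents -> three-digit group (line + position).
INDEX = {
    cell: f"{line:02d}{pos}"
    for line, cells in CHART.items()
    for pos, cell in enumerate(cells, start=1)
}

def encode(cells):
    """Replace each cell's contents by its line-and-position group."""
    return [INDEX[c] for c in cells]

def decode(groups):
    """Look each group up by line (first two digits) and position (last)."""
    return [CHART[int(g[:2])][int(g[2]) - 1] for g in groups]

groups = encode(["request", "your unit", "strengthen", "support"])
print(groups)          # ['114', '132', '105', '116']
print(decode(groups))  # ['request', 'your unit', 'strengthen', 'support']
```

Someone who knows the sentences by heart can find a cell quickly by reciting the line, which is the sense in which such a chart behaves like a one-part code.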





COMPARATIVE ORDER OF THE CHARACTERS IN THE NAMES
"Mao Ze Dong," "Shanghai," "Beijing"

STC and            Pinyin   Nat'l    Total        Sentential
Radical/Stroke              Phon.    Stroke
SHANG (0006) 上    BEI      BEI      SHANG (3)    MAO
JING  (0079) 京    DONG     MAO      MAO (4)      ZE
BEI   (0554) 北    HAI      DONG     BEI (5)      DONG
DONG  (2639) 東    JING     HAI      DONG (5)     SHANG
MAO   (3029) 毛    MAO      JING     JING (8)     HAI
HAI   (3189) 海    SHANG    SHANG    ZE (8)       BEI
ZE    (3419) 澤    ZE       ZE       HAI (10)     JING

EXAMPLE OF SENTENTIAL ARRANGEMENT

[Chart not reproduced. Translations of four of its lines follow.]

Line 10: All units are to pay attention to concealing their equipment and
strengthening air defense. Do not expose targets.

Line 11: Our advance is blocked. Request artillery support to help in
completing our mission.

Line 12: The situation has suddenly changed. Casualties are heavy.

Line 13: Your unit is directed to complete combat preparations at once.

ENCODED TEXT: 114 132 105 116

Request your unit increase support.


PROVISION FOR ADDITIONAL VALUES

In practice, the reconstruction of codes is facilitated by the fact that
the code vocabulary is not all-inclusive. Some of the characters in a given
message may not be in the code vocabulary at all, and the code clerk must
have some means of encoding the absent characters. There are several ways
to deal with "missing" characters, and each makes the bookbreaker's task a
bit easier. Among them are the phonetic spell table, phonetic variation,
character construction and dissection, and enciphered STC.


Spell Tables

Chinese codes often contain a subsystem for "spelling out" characters which
do not occur in the code vocabulary. Most common are phonetic values or
substitution tables for STC monomes or dinomes. Sometimes the code groups
for digital values are used, with a flag group, to "spell" the digits of an
STC group. A device used in pre-Communist codes was a table of radicals to
be added to the character represented by the preceding code group. Since
the spell tables can represent the Chinese language fully, they give the
effect of a cipher within a code.


Phonetic Representation

A chart containing all the initials and finals that compose National
Phonetic representations of Chinese characters permits the user to spell
the sound of a missing character. Reconstruction of a chart such as the one
which follows greatly assists the bookbreaker's recovery of values in the
basic system. (For our own convenience, we at NSA write the cell contents
in Pinyin instead of using the actual phonetic symbols.)

[Chart: initials and finals of the National Phonetic system, cell contents
written in Pinyin.]

Another way to use sound rather than meaning is phonetic variation. A
special flag group indicates that the group which follows is not to be read
as its true value but that the character intended is a homonym of the
plaintext equivalent of the group sent. For example, in a phrase beginning
with 8490, the flag group 8490 indicates that the next group (Zhong) is
being used instead of the intended character, which is also pronounced
"Zhong" but which is not in the code vocabulary. Often one digit in the
flag group will change to reflect which of the four tones the intended
character has. The flag thus more exactly identifies the intended character
among its homonyms.
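
A decoder handling this convention might look something like the sketch below. Apart from the flag group 8490 taken from the example, every value is hypothetical: the one-entry codebook, the homonym lists, and the assumption that the last digit of the flag carries the tone (0 meaning that no tone is specified).

```python
# Hypothetical code vocabulary: group -> (character, Pinyin syllable, tone).
CODEBOOK = {"1234": ("中", "zhong", 1)}

# Hypothetical homonym lists, keyed by (syllable, tone).
HOMONYMS = {
    ("zhong", 1): ["中", "忠", "钟"],
    ("zhong", 4): ["中", "重", "众"],
}

FLAG_PREFIX = "849"  # 8490 is the flag; last digit assumed to give the tone

def decode(groups):
    """Return, for each code group, the list of candidate characters."""
    result = []
    flagged, flag_tone = False, 0
    for group in groups:
        if group.startswith(FLAG_PREFIX):
            flagged, flag_tone = True, int(group[3])
            continue
        char, syllable, tone = CODEBOOK[group]
        if flagged:
            # Flag tone 0 means "same tone as the codebook entry".
            wanted = flag_tone or tone
            homonyms = HOMONYMS.get((syllable, wanted), [])
            result.append([c for c in homonyms if c != char])
            flagged = False
        else:
            result.append([char])
    return result

print(decode(["1234"]))          # [['中']]
print(decode(["8490", "1234"]))  # [['忠', '钟']]
print(decode(["8494", "1234"]))  # [['重', '众']]
```

The tone digit narrows the candidate list but seldom reduces it to one character, so context must still settle the final choice.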

Character Construction and Dissection

Using flag groups to select only a part of the preceding character (the
part itself being a separate character) is known as character dissection.
Character construction, on the other hand, involves flag groups that
instruct the recipient to "take part A of character X and add it to part B
of character Y to create the intended character." An example of dissection
might break out to "Take the right-hand side of HAI (海)," which gives us
MEI (每). An example of character construction would state: "Take the
left-hand side of HAI (海) and add it to the character JING (京)," which
gives us LIANG (涼).
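
Both operations amount to table lookups on character components, as the following sketch suggests. The two tables are hypothetical and hold only the entries needed for the HAI, MEI, JING, and LIANG example; the characters shown are the ones those readings most plausibly denote.

```python
# Hypothetical component table: character -> (left part, right part).
PARTS = {"海": ("氵", "每")}

# Hypothetical composition table: (left part, second character) -> result.
COMPOSE = {("氵", "京"): "涼"}

def dissect(char: str, side: str) -> str:
    """Return the left- or right-hand component of a character."""
    left, right = PARTS[char]
    return left if side == "left" else right

def construct(source: str, side: str, other: str) -> str:
    """Join one side of `source` to the whole of `other`."""
    return COMPOSE[(dissect(source, side), other)]

print(dissect("海", "right"))         # 每
print(construct("海", "left", "京"))  # 涼
```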




Enciphered STC 


Flags might also indicate the presence of 
STC groups within the text of coded messages. 
Such a flag would indicate that the group which 
follows is not a value in the code but is the 
STC group representing the character intended. 
This would be equivalent to inserting plaintext, 
so the STC group is often enciphered by an addi- 
tive or by transposition. 


CIPHER SYSTEMS 


There have been some substitution ciphers 
which assigned cipher equivalents to phonetic 
values, but enciphered Chinese plain text, in 
NSA parlance, is usually some encipherment of 
STC. 


Additive, substitution, and transposition are all used. An additive cipher
may be as simple as adding a four-digit constant stutter to each group,
turning 6153 0132 0932 0171 into 7264 1243 1043 1282 by the addition of
1111 (digit by digit, without carrying); it may be as secure as a one-time
running key in which a different four-digit group is added to each plain
STC group; or it may be some intermediate method.
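
The arithmetic can be sketched as follows. The third group of the example (0932 becoming 1043, not 2043) shows that the addition is done digit by digit without carrying. The running-key values below are made up for illustration.

```python
def add_noncarrying(group: str, key: str) -> str:
    """Add two equal-length digit groups digit by digit, modulo 10."""
    return "".join(str((int(p) + int(k)) % 10) for p, k in zip(group, key))

def sub_noncarrying(group: str, key: str) -> str:
    """Inverse of add_noncarrying: strip the additive from a cipher group."""
    return "".join(str((int(c) - int(k)) % 10) for c, k in zip(group, key))

plain = ["6153", "0132", "0932", "0171"]

# Constant additive: the same key group is applied to every plain group.
cipher = [add_noncarrying(g, "1111") for g in plain]
print(cipher)  # ['7264', '1243', '1043', '1282']

# Running key: a different key group for each plain group (made-up values).
running_key = ["4821", "9035", "2770", "6618"]
cipher2 = [add_noncarrying(g, k) for g, k in zip(plain, running_key)]
recovered = [sub_noncarrying(c, k) for c, k in zip(cipher2, running_key)]
print(recovered == plain)  # True
```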

































Another Chinese cipher not infrequently encountered is repaginated STC. The
repagination may be merely an end-around shift of the page-number sequence
or it may involve a random scramble of the page numbers. Either shift or
scramble may extend to the coordinates of row or column, or both, on each
page.
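
Both kinds of repagination act only on the page dinome, which the sketch below takes to be the first two digits of each group. The shift of 40 and the seeded random scramble are arbitrary illustrative choices, not values from any actual system.

```python
import random

def shift_pages(group: str, shift: int) -> str:
    """End-around shift: the page number moves by `shift`, wrapping at 100."""
    page = (int(group[:2]) + shift) % 100
    return f"{page:02d}{group[2:]}"

def make_page_scramble(seed: int) -> dict[str, str]:
    """Build a random one-to-one renumbering of pages 00-99."""
    pages = [f"{p:02d}" for p in range(100)]
    shuffled = pages[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(pages, shuffled))

def scramble_pages(group: str, table: dict[str, str]) -> str:
    """Replace the page dinome using a scramble table."""
    return table[group[:2]] + group[2:]

plain = ["6153", "0132", "0932", "0171"]
shifted = [shift_pages(g, 40) for g in plain]
print(shifted)  # ['0153', '4132', '4932', '4171']

table = make_page_scramble(seed=1)
scrambled = [scramble_pages(g, table) for g in plain]
print(scrambled)
```

The same two functions could be applied to the row or column digit, with a modulus of 10, to extend the shift or scramble to the coordinates.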


Local transposition within a group or the insertion of nulls can disguise
the basic STC group. Thus with transposition cabd, 6153 0132 0932 0171
becomes 5613 3012 3092 7011. Nulls can stretch each group to five digits.
Inserting a 0 between b and c in the groups of our example gives us 61053
01032 09032 01071. In practice, transposition and insertion of nulls are
often combined, so that, in the example given, the original message would
become 56013 30012 30092 70011.
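
The three variants in the example can be reproduced with two small functions. In the combined case the null is inserted after the second digit of the already transposed group, which is how the figures in the example work out.

```python
def transpose(group: str, pattern: str = "cabd") -> str:
    """Reorder a four-digit group; letters a-d name the original positions."""
    return "".join(group["abcd".index(ch)] for ch in pattern)

def insert_null(group: str, position: int = 2, null: str = "0") -> str:
    """Insert a null digit after the first `position` digits."""
    return group[:position] + null + group[position:]

plain = ["6153", "0132", "0932", "0171"]

transposed = [transpose(g) for g in plain]
padded = [insert_null(g) for g in plain]
combined = [insert_null(transpose(g)) for g in plain]

print(transposed)  # ['5613', '3012', '3092', '7011']
print(padded)      # ['61053', '01032', '09032', '01071']
print(combined)    # ['56013', '30012', '30092', '70011']
```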


While it might be argued that the additive 
and repaginations belong under "codes" rather 
than "ciphers," we include them under ciphers 
because the same basic extensive vocabulary is 
used. Chinese codes selectively restrict this 
basic vocabulary. 


Systems such as those discussed here have 
been used by the PRC in its communications. 
Knowing some of these methods certainly lightens 
the work of the bookbreaker. As another writer 
has said, "Chinese codebooks are nothing more 
than a compilation of written characters, ex- 
pressed numerically for the purpose of tele- 
graphic communications. The real clue to the 
structure of a code lies in the arrangement of 
these characters." 







Can you make out the name?
A real-life puzzle submitted by GLENN EMERY, P16

We know that in a certain non-English codebook the roman alphabet occurs as
shown on a code page consisting of a 10 x 10 matrix (see above). The
initial dinome of a code group represents the page, the final dinome column
and row, respectively.


A message is received which contains the 
following groups in mid-text: 


(3742 3792 3732 3767) (3742 3767 3732 3775 3772)


Parentheses have been observed in other 
messages setting off groups which represent spe- 
cial categories of information supplemental to 
the main body of the code, and context indicates 
that the parenthetical groups in this message 
represent the name of a powdered milk available 
in the Southeast Asian market. So it is suspec- 
ted that the groups are two words using the roman 
alphabet, whose page has been renumbered as 37. 
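
A first step toward a solution is to strip the page dinome and note which cells repeat. The sketch below does only that; it reads the eighth group as 3775 and makes no guess at the letters themselves.

```python
message = [
    ["3742", "3792", "3732", "3767"],
    ["3742", "3767", "3732", "3775", "3772"],
]

def split_group(group: str) -> tuple[str, int, int]:
    """Return (page dinome, column, row) per the stated group layout."""
    return group[:2], int(group[2]), int(group[3])

def letter_pattern(words):
    """Label each distinct (column, row) cell A, B, C... as it first appears."""
    labels = {}
    patterns = []
    for word in words:
        out = ""
        for group in word:
            _, col, row = split_group(group)
            labels.setdefault((col, row), chr(ord("A") + len(labels)))
            out += labels[(col, row)]
        patterns.append(out)
    return patterns

print(letter_pattern(message))  # ['ABCD', 'ADCEF']
```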


What are the words, and which column and 
row coordinates can be recovered? 


(Solution next month.) 